Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't mark the region tracker page as allocated #899

Merged
merged 3 commits into from
Nov 25, 2024

Conversation

mconst
Copy link
Contributor

@mconst mconst commented Nov 19, 2024

When writing the allocator state table, it's better not to include the region tracker page as one of the allocated pages. Instead, we should allocate the region tracker page during quick-repair after loading the rest of the allocator state, exactly like we do for full repair.

This avoids an unlikely corruption that could happen for databases larger than 4 TB, if we crash on shutdown after writing out a new region tracker page number but before clearing the recovery_required bit. It also means that in the future, when we drop support for the old allocator state format, we'll be able to get rid of the region tracker page entirely instead of having to keep around some of the allocation code for compatibility.

@mconst mconst marked this pull request as draft November 19, 2024 00:29
@mconst mconst force-pushed the region_tracker_alloc branch 2 times, most recently from ffb8f6d to b798a4a Compare November 25, 2024 01:36
@mconst
Copy link
Contributor Author

mconst commented Nov 25, 2024

Okay, I think this is ready. Sorry for the delay! The first two commits are fixes for preexisting region-tracker bugs I ran into; the last one is the main change.

@mconst mconst marked this pull request as ready for review November 25, 2024 01:54
@cberner
Copy link
Owner

cberner commented Nov 25, 2024

Ah nice, ya I like this! I think this conflicted with your other PR that I just merged. Can you rebase and I'll merge this one too?

If we relocate the region tracker page during compaction and then the
transaction aborts, the allocation of the new tracker page gets rolled
back, but the free of the old tracker page doesn't! This leaves us in
an inconsistent state with no tracker page allocated.

So we need to use allocate_non_transactional() here, which guarantees
the page will remain allocated even if the rest of the transaction rolls
back.
If repair fails, our allocator state is inconsistent; don't try to write
it to disk
When writing the allocator state table, it's better not to include the
region tracker page as one of the allocated pages. Instead, we should
allocate the region tracker page during quick-repair after loading the
rest of the allocator state, exactly like we do for full repair.

This avoids an unlikely corruption that could happen for databases
larger than 4 TB, if we crash on shutdown after writing out a new
region tracker page number but before clearing the recovery_required
bit. It also means that in the future, when we drop support for the old
allocator state format, we'll be able to get rid of the region tracker
page entirely instead of having to keep around some of the allocation
code for compatibility.
@mconst mconst force-pushed the region_tracker_alloc branch from b798a4a to 420791e Compare November 25, 2024 04:36
@mconst
Copy link
Contributor Author

mconst commented Nov 25, 2024

Oh yeah, sorry about that! Here's the rebased version.

@cberner cberner merged commit dc0460a into cberner:master Nov 25, 2024
3 checks passed
@mconst mconst deleted the region_tracker_alloc branch November 25, 2024 04:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants